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Abstract We show that the monthly recorded history (1878-2013) of the 
Southern Oscillation Index (SOI), a descriptor of the El Nino Southern Oscil¬ 
lation (ENSO) phenomenon, can be well described as a dynamic system that 
supports an average nonlinear predictability well beyond the spring barrier. 
The predictability is strongly linked to a detailed knowledge of the topology 
of the attractor obtained by embedding the SOI index in a wavelets base state 
space. Using the state orbits on the attractor we show that the information 
contained in the Southern Oscillation Index (SOI) is sufficient to provide av¬ 
erage nonlinear predictions for time periods of 2, 3 and 4 years in advance 
throughout the 20th century with an acceptable error. The simplicity of im¬ 
plementation and ease of use makes it suitable for studying non linear pre¬ 
dictability in any area where observations are similar to those that describe 
the ENSO phenomenon. 
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1 Introduction 

The El Nino Southern Oscillation is a large scale tropical Pacific atmosphere- 
ocean phenomena. m. ®, 0 one of the stronger quasi-oscillatory pattern 
observed in the climate system that induces important changes in the global 
Earth’s gravity field [3], influence precise variations of geodetics parameters 
such as Geocenter [5j may play a major role in the generation of seismicity 
[6], or serve as a modulator of the external sun’s influence on Earth’s climate 
|7|. However teleconnections not only influences climate worldwide but also 

H. F. Astudillo and F. A. Borotto 

Departamento de Fisica, Universidad de Concepcion, P.O. Box 160-C, Concepcion, Chile 
R. Abarca-del-Rio 

Departamento de Geofisica, Universidad de Concepcion, P.O. Box 160-C, Concepcion, Chile 





2 


influences economic, political and social related aspects of the Earth system 
i, 0, pm, m. Thus, ENSO signal is a key index representing the complex 
dynamics of the Earth system as a whole [12] . It is therefore understandable 
that comprehension of its dynamics is one of the most important scientific 
milestones today and in parallel its prediction is one of the major challenges, 
and no longer an element solely of interest to climatologists alone. 

International efforts to forecast ENSO using both dynamical and statistical 
methods, have met with some success. Thus, with lead times up to 6 mo, 
various forecasts performed reasonably well, whereas for longer lead times, 
that is passing through the boreal spring barrier, the performance becomes 
rather low, particularly for the most extreme events such as El Nino 1982-83 
and 1997-98’s [13], |14j . However recent results [T5] highlight potential skill 
in predicting tropical Pacific variability at lead times exceeding 1 year for 
some events m- This is an evidence that the ability to predict ENSO events 
for periods longer than one year and perhaps particularly those leading to 
greater consequences worldwide is also feasible. However, the previous stage 
before prediction, is to first show that this goal is achievable. That is, there is 
at least enough information within the oscillatory system, as represented by 
the main ENSO index, the Southern Oscillation Index (SOI), that may allow 
prediction of ENSO events. 

The main purpose of this paper is to report that such evidence exists. In 
a precedent article m , a methodology for applying Takens’s embedding the¬ 
orem 118j to reconstruct an event is developed, by taking into account just 
the local information contained in a short time series. The physical basis of 
the method lies in the definition of the local predictability [19]. Local pre¬ 
dictability limit gives a measure of long time-scale local predictability on the 
attractor [22] ■ These results outlined the capabilities of the methodology which 
this local reconstruction method uses to characterize the local and non-linear 
properties of a given system dynamics. Thus, with the knowledge of the vari¬ 
ability over the past 20th century, it was possible to reconstruct local extreme 
ENSO events (1982-83 and 1997-98), leading by more than two years in ad¬ 
vance. Likewise, the method can be applied to arbitrarily large sizes but the 
search for the relevant embedding state, or - space subspaces that character¬ 
ize the phenomena requires a high computational cost, hindering the use of 
the method, and thereby reproduction of results. A simpler but complemen¬ 
tary approach allows ease reproducibility, allowing to reconfirm the precedent 
study. Thanks to a significant breakthrough that overcame the computational 
and technical difficulties involved in the application of the method to embed 
a signal of limited temporal extension in a state space. As m it has already 
been reported that the wavelet space and the embedding space are equivalent 
in topology, therefore a description is possible by using under Takens’s embed¬ 
ding theorem, which allows to reconstruct a state space which is diffeomorphic 
to the physical phase space from a time series [22]. Thus, in this analysis, the 
SOI is considered as a macroscopic variable that has been embedded in the 
n-dimensional state space. So that, we assume that the SOI represents a vari¬ 
able that is a part of non-linear differential equations describing the oscillation. 
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Extending HU to the problem under consideration, a wavelet decomposition 
separates the different n state coordinates from the signal equivalently as de¬ 
scribed in [23j and [24]. Thus liberating us from the tedious searching process 
for the optimal state spaces as was done previously. Thus, the process’s pre¬ 
sented in the precedent investigation can be eased, making it more attractive 
and straightforward for its application and reproducibility. 

Thus, the methodology presented here allows to search whithin the whole 
length of the times series non linear trajectories that do present the same prop¬ 
erties that the non linear trajectories of the attractor of a particular event in¬ 
vestigated. Therein it is possible using the sole non-linear information already 
present whitin a time series to find similar attractor-trajectories that allows 
to investigate if a particular ENSO event is reproducible to a certain extent. 


2 Methodology 

To achieve this goal we first define an n-dimensional state space E n . The evo¬ 
lution of the system in this state space is recorded in a state trajectory given 
by r n (t) = ..., S n (t)), where quantities ..., S n (t) are obtained by 

wavelet decomposition of the SOI signal [25]. From the original signal, a given 
number n of component signals Si are obtained by means of wavelet filtering 
the monthly SOI index between the years 1866-2013. We decompose the SOI 
data using a Discrete Wavelet Transformation (DWT) multiresolution algo¬ 
rithm with a Morlet mother wavelet function m, ehi- In our case we choose 
an 8 level decomposition, considering the monthly time series resolution. Be¬ 
cause fewer levels did not allow to separate efficiently the interannual time 
scales or capture interdecadal time scales efficiently that characterize ENSO 
[3], so each of the components (S',, i from 1 to 8) represent a given bandwidth 
(1-2 intra-seasonal, 3-5 interannual, 6-8, decadal to interdecadal) whereas the 
last component Sg represents lowest frequencies, that describes the trend. In 
other words SOI = Si + .-. + Sg. In this paper we use a state space Eg following 
m encrypting all information contained in the SOI in the state trajectory, 
but we also define a six dimensional subspace, Eg where the state trajectory 
is given by rg(f) = (£ 4 ( 4 ),..., Sg(t)) containing information only in the low 
frequency range (lower than 1/yr). We compute the low frequency part of the 
SOI signal as Sl = 64 + ... + Sg. 

The event (An El Nino or La Nina event) to be predicted 5 l 4 is selected 
from Sl (represented by the red segment in Figure |TJi) . The first point in Sl 4 
(represented by a red dot in Figure [ljr) is the starting point at time t sp and 
the terminal point is located by T steps in the future. To select another part of 
the signal Sl that enables to reproduce the relevant features of the 57,; event, 
we seek a state point of the state trajectory in the attractor around which 
the embedded signal SOI has the same topological features like than around 
the starting point of the event. Thus, both state points are located about the 
same region in the attractor. Figure [TJr and Figure [TJl shows bidimensional 
projections of the state trajectory (in orange) around the attractor, while red 
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and magenta colors plots shows the selected event and the most similar event, 
respectively. To accomplish this task we calculate the Euclidean distance of 
all state points in the state trajectory to the starting state point. In the Eg 
reconstructed state space the Euclidean distance between each state point of 
state trajectory to the starting state point is given by d(t) =|| r 9 (t) — r 9 (f sp ) ||, 
which is plotted in blue in Figure [Ox Then, we collect the times, tj, where the 
curve d = d(t) exhibits local minima. Each selected local minimum of the 
function d(t) corresponds to a state point on each orbit that is closest to 
the starting state point in the reconstructed state space E 9 . This search is 
performed away from a region of T steps from each side (to the past and 
the future) of the starting point, t sp (regions in black in Figure [T]:). For each 
of the collected times, tj a piece of the corresponding Sl of T steps long is 
extracted into its own future (that is Sl 3 )- As shown in Figure Q!i, the Sl 3 
piece (magenta) is then moved to the starting point, t sp (the red dot) through 
a temporal translation (the magenta arrow). In addition, the translated signal 
(the Slj piece), as an amplitud gauge (the red arrow in Figured]), is shifted up 
or down so that its first point (originally at tj ) matches the value of Sl at the 
starting point at time t sp (the event to be predicted, that is Sl,). In addition, 
the physical consistency implies that all the selected Sl 3 pieces must contain 
the same number of maxima and minima than the chosen event in Sl (Sx 4 ). 
Finally, of all the Sl, pieces selected and translated as described, the piece 
that contains the same number of maxima and the same number of minima 
as iSx 4 which produces the least quadratic difference is definitely chosen (the 
magenta dot in Figured])- 

To make a robust estimate of the event we perform an average of six con¬ 
secutive months of predictions resulting a T e ff = T — 12 months prediction, 
which is plotted in green in Figured] after a necessary second amplitude gauge. 
Thus, the averaged prediction begins on the date of the last prediction used 
for constructing the 6 monthly mean that has the final date of the first pre¬ 
diction used. Thus on average an initial three years local prediction reduces 
to an mean prediction of about two years. 

Finally we compute the error of the prediction of these two year long pre¬ 
diction by normalizing to the standard deviation of Sl- The error is reported 
as a bar below the signal. The red line in Figures [2] to [5] indicate the thresh¬ 
old of 0.5 which means that the error of the prediction reach the half of the 
standard deviation (SD) of the whole signal Sl- 


3 Results 

Remarkably (Figures [2] [3] and |4]) one of the most extreme event of the 20th 
century, namely the 1997-98 Nino event could be clearly anticipated well be¬ 
yond a two year lead to the least. Thus it is possible to find trajectories in 
the SOI time series that present the same nonlinear characteristics than the 
variability of the 1997-98 event. Thus, the relative error for the two year lead 
Figurell]) is below the 0.5 threshold. Instead for the 1982-83 event, the relative 
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error is well above the 0.5 threshold. That is, meaning that it is not possible 
to find any correct trajectory/attractor within the whole 1878-2013 time span 
that present the same non-linear properties than the 1982-83 event. 

The analysis of relative error histograms (Figure (5jr) shows that when the 
lead times increases (from 2 to 4 years), the averaged relative error indeed 
increases, from 0.177, 0.298 and 0.389 for T e ff = 2,3,4 years respectively (see 
Figure [5jr and Table [T]). 

Indeed while increasing the lead times, the error for the particular 1982-83 
and 1997-98 events increases. However, for the El Nino 1997-98, while increas¬ 
ing the lead times, the relative error always falls below the 0.5 threshold (lesser 
than 0.37). The orbit/trajectories that are closer to an event and that allows 
the accurate averaged prediction, changes depending on lead time (and can 
change from one month to another). However, the analysis of Figure [5}:) shows 
that the histogram of closer trajectories is a Gaussian-shaped distribution, 
that is, in general the non-linear prediction always finds a trajectory in the 
vicinity of the event in the state space. 

Thus, if we had to look for the trajectories used for the construction of the 
2 year lead prediction of the particular 1997-98 event, throughout the span 
of the event, these would be those of the 1911, 1972-73 and 1982-83 events. 
Interestingly the orbits/trajectories that naturally emerges for the 1982-83 
event, and eventhough this event rises above the 0.5 threshold, are indeed 
preferentially the 1997-98 (in the future) and also the 1901-02 events. 


4 Discussion and Conclusions 

As already stated the physical support of this methodology consider the dy¬ 
namics of ENSO described as a dynamic system. Thus, the signal, can be con¬ 
sidered as a solution for a nonlinear system of differential equations describing 
the oscillation. In virtue of the Takens’s theorem the signal is embedded in the 
9-dimensional state space Eg so that the relevant information contained in the 
system is encrypted in a state trajectory of the system. With this procedure 
the state points of the state trajectory, ie, the states of the system are ar¬ 
ranged in orbits around an attractor. The idea behind this methodology relies 
on possibility to find in past or future evolutions of a time series events that 
present the same characteristics (orbits around the attractor) than the event 
investigated. The closest orbits increases the capacity of event predictability. 

Once an event Sl ( is selected in the signal Sl, the starting point at time 
t sp and a time interval of prediction T is determined. The starting point of the 
selected event corresponds to a system state point on the attractor. To find 
an event in the signal, SV ■ , corresponding to a similar solution of the same set 
of deferential equations, we determine the set of state points t spj that are in 
a region around the starting state point in the attractor (in Eg). We remark 
that the reconstructed state space Eg is diffeomorphic to the physical phase 
space m- 
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The selection of the most similar piece of the event signal is by performing 
the choicest among all segments of length T, starting at state points that are 
in vicinity of the starting state point of the event. These datas provide the 
least deviation standard with the same number of peaks and the same number 
of minima as the event signal. 

The most important result of this study is that the so-called SOI anomaly 
corresponds to the dynamics of a nolinear oscillator having complex regular¬ 
ities and exhibits an acceptable level of accuracy of average non-linear pre¬ 
dictability in the range between 2 and 4 years of time span. Although the 
topology of the attractor is unknown, for each event it is always possible to 
find an orbit corresponding to another event that ocurs in the past or in the 
future of the starting point which has the same category. The knowledge of 
the topological structure of the attractor is a necessary task for forecasting. 
In our method, as discussed previously, we chose the best orbit in the attrac¬ 
tor. The histograms at the bottom of Figures [5jr and [SJd shows that there is 
no deterministic rule that allows us to choose the right orbit with certainty, 
although a clear trend towards the nearest orbit was observed. 

Note that indeed the methodology here used, as it uses a wavelet filter 
bank with different scales does take in account, when constructing the orbit 
attractor for a particular event, a part of the closest (past or future) evolution 
of the time series. However, this is performed only for constructing the attrac¬ 
tor and search for similitudes within the whole length of the time series and 
not for constructing the prediction. Therefore, this lapse is already existing 
and is not built in any way with the method. In addition, the methodology 
find trajectories that are far away in time from the event analysed. Also, in a 
precedent paper m we already demonstrated that the methodology worked 
although search for relevant embedding space was tedious and with high com¬ 
puter cost. Here we only extend the methodology, thanks to the wavelet space 
to a set of embedding state spaces that indeed facilitates the search for clos¬ 
est orbits and therefore for similar trajectories within the length of the time 
series. In other words the wavelet filtering bank does not interfere with the 
time selected. Finally the method is not filter dependent. Thus, we tested the 
methodology with two different filtering banks, that is with Empirical modal 
decomposition EMD ,30j and Singular Spectrum analysis SSA [22]. In both 
cases, and particularly for SSA, the methodology could reconstruct correctly 
most events throughout the 20th century but failed for the 1982/83 event. 

Although we do not know exactly the selection rule to determine the op¬ 
timal orbits that allows forecasting that does not invalidate the fact that the 
SOI signal has built enough information to reproduce events over at least a 2 
years span, surpassing throughout the 20th century, with an acceptable error, 
with exception of the 1982-1983 event. 
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Table 1 Mean, Variance, Skweness and Kurtosis of normalized errors distributions shown 
in Figures [5^, and(5j^- 
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4.080 
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Year 


Fig. 1 Figure details the steps to be carried out to determine which part of the signal Sl 
that is most similar to the event to be predicted. In panel a the Sl signal is plotted. The 
red dot is the starting point of the event that corresponds to the first starting state point in 
Figure[2| while the red section indicates the extent of the event to be predicted. The magenta 
point indicates the point corresponding to the nearest point on the chosen orbit. In panel 
c the blue line is the Euclidean distance between the starting point and each point of the 
state trajectory calculated in Eg, while the green curve is the Euclidean distance between 
the starting point and each point of the path that was calculated in E6- In panels b and 
d we plot (S 8 ,Sq) and (S's, S 4 ) two-dimensional projections of the attractor, respectively. 
The orange curves are projections of the state trajectory. 
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Year 


Fig. 2 Above: The Figure outlines the steps followed in the procedure for obtaining an av¬ 
erage prediction for two years. Points in black are the monthly amplitudes of low-frequency 
SOI index ( Sl )• The blue points are the starting points for each prediction for three years, 
which are drawn with blue lines.The green curve is the two years average nonlinear predic¬ 
tion. Below: The red line marks the boundary where the mean square error of the average 
nonlinear prediction is half the standard deviation of Sl ■ The green bar at the bottom indi¬ 
cates the mean square error relative to the standard deviation of Sl of the average nonlinear 
prediction drawn in green. 



1994 1995 1996 1997 1998 1999 

Year 


Fig. 3 Above: Average non-linear predictions for two years lead (in green) and the Sl 
time series (in black) are drawn. Below: The relative error. The red line indicates the 0.5 
threshold. The green, cyan, and orange bars are the average non linear predictions error for 
T e ff = 2, 3 and 4 years, respectively. 
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Year 


Fig. 4 Above: The performance of the procedure for average non-linear predictions with 
T e ff = 2 years of El Nino during the 20th century. Below: The relative error. The red line 
indicates the 0.5 threshold. 
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Fig. 5 a) Histograms of the relative errors of the average non-linear predictions of El Nino 
during the 20th century are shown. Relative errors for 2, 3 and 4 years are depicted in green, 
cyan and orange colors, respectively, b) the histograms of the rank of the selected orbit for 
each non-linear prediction is been represented. The histogram of orbit rank for 2, 3 and 4 
years are represented using green, cyan and orange colors, respectively. 










